Incremental and Adaptive Software Systems Development of Natural Language Applications

Authors

  • Elena Lloret
  • Santiago Escobar
  • Manuel Palomar
  • Isidro Ramos
Abstract

Natural Language (NL) processing tools, such as tokenizers, part-of-speech taggers, or syntactic processors, obtain knowledge from a set of documents (e.g., tokens, syntactic patterns, etc.) and produce the different elements that will take part in the discourse universe of an NL text (e.g., noun phrases, verbs, sentences, etc.). In this paper, we present how NL software systems development can be performed incrementally by using a high-performance specification language like Maude. A generic algebraic specification for NL is defined, including sorts and subsorts as well as equational properties, such as associativity and commutativity for built-in lists and sets. Then, the full discourse universe available for NL processing is described in terms of the algebraic specification by providing a non-deterministic but terminating set of transformation rules. Finally, as a proof of concept, a set of documents for NL processing is given to Maude as an input term and successfully transformed into a proper document, exploring all the non-deterministic possibilities and resolving the ambiguity in language. The main advantages of implementing NL in this manner are generality, transparency, extensibility, reusability, and maintainability. To the best of our knowledge, this is the first attempt to represent and develop complex NL software systems with this formal notation and, based on the analysis conducted, this implementation constitutes the basis for the design and development of more specific NL processing applications, such as text summarization.

† Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante, Apdo. de correos 99, E-03080 Alicante, Spain
‡ Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, Valencia, Spain
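The abstract's central idea, a non-deterministic but terminating set of transformation rules whose alternatives are all explored, can be illustrated with a small sketch. This is a Python stand-in, not Maude and not the authors' specification: the toy lexicon, the grammar rules, and the names `RULES`, `successors`, and `normal_forms` are assumptions made only for illustration. Each rule either replaces a word with a category or shortens the term, so rewriting always stops.

```python
from typing import Iterator

Term = tuple[str, ...]
Rule = tuple[Term, Term]

# Hypothetical toy rules (not taken from the paper).
# "saw" has two lexical rules, which makes rewriting non-deterministic.
RULES: list[Rule] = [
    (("the",), ("Det",)),
    (("dog",), ("Noun",)),
    (("saw",), ("Noun",)),
    (("saw",), ("Verb",)),
    (("Det", "Noun"), ("NP",)),
    (("Verb", "NP"), ("VP",)),
    (("NP", "VP"), ("S",)),
]


def successors(term: Term) -> Iterator[Term]:
    """Yield every term obtained by applying one rule at one position."""
    for lhs, rhs in RULES:
        width = len(lhs)
        for i in range(len(term) - width + 1):
            if term[i:i + width] == lhs:
                yield term[:i] + rhs + term[i + width:]


def normal_forms(start: Term) -> set[Term]:
    """Explore all rewrite sequences and collect terms no rule applies to."""
    seen = {start}
    stack = [start]
    finals: set[Term] = set()
    while stack:
        term = stack.pop()
        nexts = list(successors(term))
        if not nexts:
            finals.add(term)  # irreducible: a normal form
        for nxt in nexts:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return finals


sentence: Term = ("the", "dog", "saw", "the", "saw")
results = normal_forms(sentence)
# Keep only the outcomes that reduce to a full sentence.
parses = [t for t in results if t == ("S",)]
print(sorted(results))
print(parses)
```

In this toy setting, ambiguity is resolved by keeping only the outcome that reduces to a single `S`; the other normal forms are dead ends produced by the wrong tagging of "saw". In Maude, the analogous exhaustive exploration would typically be done with its built-in search facilities rather than a hand-written loop.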

Similar resources

A Review of Research in Behavioral Programming

Behavioral programming is an approach for non-intrusive incremental software development. Introduced through scenario-based programming in the language of live sequence charts (LSC), it is now implemented also in Java and in the functional programming language Erlang. Behavioral programming calls for constructing systems from threads of behavior, each of which independently represents (a part o...

Incremental Semantics Driven Natural Language Generation with Self-Repairing Capability

This paper presents the on-going development of a model of incremental semantics driven natural language generation (NLG) for incremental dialogue systems. The approach is novel in its tight integration of incremental goal-driven semantics and syntactic construction, utilizing Type Theory with Records (TTR) record types for goal concepts as its input and the grammar formalism Dynamic Syntax (DS...

Combining Incremental Language Generation and Incremental Speech Synthesis for Adaptive Information Presentation

Participants in a conversation are normally receptive to their surroundings and their interlocutors, even while they are speaking and can, if necessary, adapt their ongoing utterance. Typical dialogue systems are not receptive and cannot adapt while uttering. We present combinable components for incremental natural language generation and incremental speech synthesis and demonstrate the flexibi...

A Document-Oriented Approach to the Development of Knowledge Based Systems

ADDS (Approach to Document-based Development of Software) is an approach to the development of applications based on a document-oriented paradigm. According to this paradigm, applications are described by means of documents that are marked up using descriptive domain-specific markup languages. Afterwards, applications are produced processing these marked up documents. Formulation of domain-spec...

LearningPinocchio: adaptive information extraction for real world applications

The new frontier of research on Information Extraction from texts is portability without any knowledge of Natural Language Processing. The market potential is very large in principle, provided that a suitable easy-to-use and effective methodology is provided. In this paper we describe LearningPinocchio, a system for adaptive Information Extraction from texts that is having good commercial and s...

Publication date: 2013